Accessing Scientific Data: Simpler is Better
Abstract
A variety of index structures have been proposed to support fast access to and summarization of large multidimensional data sets. Some of these indices are fairly involved, so few are used in practice. In this paper we examine how to reduce I/O cost by taking full advantage of recent trends in hard disk development, which favor reading large chunks of consecutive disk blocks over seeking and searching. We present the Multiresolution File Scan (MFS) approach, based on a surprisingly simple and flexible data structure that outperforms sophisticated multidimensional indices, even when they are bulk-loaded and hence optimized for query processing. Our approach can also incorporate a priori knowledge about the query workload, and it readily supports summarization using distributive (e.g., count, sum, max, min) and algebraic (e.g., avg) aggregate operators.
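The distinction between distributive and algebraic aggregates mentioned in the abstract can be illustrated with a minimal sketch. It is not the MFS data structure itself, which the abstract does not specify. The names `Partial`, `summarize`, `merge`, and `average` are hypothetical, and the in-memory lists stand in for chunks of consecutive disk blocks read by a sequential scan.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Partial:
    """Per-chunk summary holding only distributive aggregates (hypothetical type)."""
    count: int
    total: float
    minimum: float
    maximum: float


def summarize(chunk: List[float]) -> Partial:
    """Scan one non-empty chunk once and compute its partial aggregates."""
    return Partial(len(chunk), sum(chunk), min(chunk), max(chunk))


def merge(parts: Iterable[Partial]) -> Partial:
    """Combine partials; this works because count, sum, min and max are distributive."""
    parts = list(parts)
    return Partial(
        count=sum(p.count for p in parts),
        total=sum(p.total for p in parts),
        minimum=min(p.minimum for p in parts),
        maximum=max(p.maximum for p in parts),
    )


def average(p: Partial) -> float:
    """avg is algebraic: it is derived from the distributive pair (sum, count)."""
    return p.total / p.count


# Assumed toy data: three chunks of different sizes.
chunks = [[1.0, 2.0, 3.0], [10.0], [4.0, 6.0]]
combined = merge(summarize(c) for c in chunks)
```

Averaging the per-chunk averages would weight the one-element chunk as heavily as the three-element chunk, which is why the sketch carries the sum and the count forward and divides only at the end.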
Similar resources
SCIENTIFIC NOTE A Bioassay Method for Black Flies (Diptera: Simuliidae) Using Larvicides
This note presents an alternative method for bioassays using black fly larvae, with which we intend to provide a simpler but effective tool for accessing data on larvicide efficacy.
Data Mining for Very Busy People Tar2: a Simpler, Shorter Rule
For 21st-century businesses, the problem is not accessing data but ignoring irrelevant data. Most modern businesses can electronically access mountains of data such as transactions for the past two years or the state of their assembly line. The trick is effectively using the available data. In practice, this means summarizing large data sets to find the “pearls in the dust”, that is, the data t...
Accessing Full Text of Articles: A Study on the Status of Medical Universities in Tehran
Introduction. Due to the rapid development of information technology and the World Wide Web, there is easy and fast access to medical information and medical journals. Although there is free and easy access to articles' abstracts through Medline on the internet, accessing full-text articles still remains a problem. This study was carried out to investigate the best way we could access full text of ...
Comparison of Knowledge Levels Required for SNOMED CT Coding of Diagnosis and Operation Names in Clinical Records
OBJECTIVES: Coding complex and polysemous clinical terms with Systematized Nomenclature of Medicine, Clinical Terms (SNOMED CT) may require the coder to have a high level of knowledge of clinical domains, whereas coding simpler clinical terms may require only simpler knowledge. However, few studies show quantitatively the relation between domain knowledge and coding ability. So, we tried to...
Report on CLEF-2002 Experiments: Combining Multiple Sources of Evidence
For our second participation in the CLEF retrieval tasks, our first objective was to propose better and more general stopword lists for various European languages (namely, French, Italian, German, Spanish and Finnish) along with improved, simpler and efficient stemming procedures. Our second goal was to propose a combined query-translation approach that could cross language barriers and also an...
Publication date: 2003